Automatically Adapting Source Code to Document Provenance

Author

  • Simon Miles
Abstract

Being able to ask questions about the provenance of some data requires documentation on each influence on that data’s existence and content. Much software exists, and is being developed, for which there is no provenance-awareness, i.e. at best, the data it outputs can be connected to its inputs, but with no record of intermediate processing. Further, where some record of processing does exist, e.g. as logs, it is not in a form easily connected with that of other processes. We would like to enable compiled software to record useful documentation without requiring prior manual adaptation. In this paper, we present an approach to adapting source code from its original form without manual manipulation, to record information on data provenance during execution.

Similar resources

Provenance of Publications: A PROV Style for LaTeX

In general, the task of generating provenance is still tedious, and the community still lacks tools to generate provenance easily. In particular, when writing papers, researchers should be able to produce the provenance of their papers, make it available online, and embed provenance metadata directly in their PDF files. To address this goal, we introduce prov.sty, a PROV style for LaTeX, allowin...

Reconstructing Provenance

Provenance is an increasingly important aspect of data management that is often underestimated and neglected by practitioners. In our work, we target the problem of reconstructing provenance of files in a shared folder setting, assuming that only standard filesystem metadata are available. We propose a content-based approach that is able to reconstruct provenance automatically, leveraging sever...

Revealing the Detailed Lineage of Script Outputs using Hybrid Provenance

We illustrate how combining retrospective and prospective provenance can yield scientifically meaningful hybrid provenance representations of the computational histories of data produced during a script run. We use scripts from multiple disciplines (astrophysics, climate science, biodiversity data curation, and social network analysis), implemented in Python, R, and MATLAB, to highlight the use...

Layering in Provenance-Aware Storage Systems

Digital provenance describes the ancestry or history of a digital document. Provenance provides answers to questions such as: “How does the ancestry of these objects differ?” “Are there source code files tainted by proprietary software?” “How was this object created?” Prior systems used to collect and maintain provenance operate within a single layer of abstraction: the system call boundary, a ...

Supporting dynamic pipeline changes using Class-Based Object Versioning in Astro-WISE

Understanding the difference between data objects is a major problem especially in a scientific collaboration which allows scientists to collectively reuse data, modify and adapt scripts developed by their peers to process data while publishing the results to a centralized data store. Although data provenance has been significantly studied to address the origins of a data item, it does not howe...

Publication date: 2010